Operating HAWKI RAG with Makefile
Platform defaults from Makefile
The base compose file is always docker-compose.yml; the Makefile exports COMPOSE_FILE before calling docker compose.
| Linux | macOS |
|---|---|
| Default USE_OLLAMA_GPU is auto. | Default USE_OLLAMA_GPU is 0. |
| If nvidia-smi exists, docker-compose-gpu-override.yml is added automatically. | Runs in CPU mode by default (no GPU override). |
| Effective COMPOSE_FILE: docker-compose.yml:docker-compose-gpu-override.yml (when a GPU is detected). | Effective COMPOSE_FILE: docker-compose.yml. |
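For orientation, here is a minimal shell sketch of the selection logic described in the table; the real Makefile implements this internally and its exact variable handling may differ.
# Sketch only: mirrors the documented behavior, not the Makefile's actual code
gpu="${USE_OLLAMA_GPU:-auto}"
if [ "$gpu" = "auto" ]; then
  # auto: enable the GPU override when nvidia-smi is present (Linux)
  if command -v nvidia-smi >/dev/null 2>&1; then gpu=1; else gpu=0; fi
fi
if [ "$gpu" = "1" ]; then
  export COMPOSE_FILE="docker-compose.yml:docker-compose-gpu-override.yml"
else
  export COMPOSE_FILE="docker-compose.yml"
fi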
Key overrides (per run)
- USE_OLLAMA_GPU (default auto): auto detects a GPU on Linux; 1 forces the GPU override; 0 forces CPU mode.
- ENV_FILE (default .env): choose the env file.
- CRAWLED_ROOT (default /app/shared): ingest root for make ingest.
- COMPOSE_PROFILES: optional profile toggle (e.g. gpu to include raganything_api_gpu).
- BASE_COMPOSE_FILE / GPU_OVERRIDE_COMPOSE: advanced override of the compose filenames.
Examples:
# Force CPU mode
USE_OLLAMA_GPU=0 make up-core
# Force GPU override
USE_OLLAMA_GPU=1 make up-core
# Start with profile-gated GPU API too
USE_OLLAMA_GPU=1 COMPOSE_PROFILES=gpu make up-core
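The remaining overrides combine the same way; in this example the env file name and ingest folder are placeholders:
# Use an alternate env file and a custom ingest root (placeholder values)
ENV_FILE=.env.local CRAWLED_ROOT=/app/shared/my-crawl make ingest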
Compose/Dockerfile roles
- docker-compose.yml: CPU-safe base stack and the default ollama service.
- docker-compose-gpu-override.yml: overrides only ollama to the CUDA build + NVIDIA device reservation.
- docker/laravel.Dockerfile: builds hawki_rag_app.
- Dockerfile: builds hawki_rag_bridge (python-rag target) and hawki_rag_rerank (rerank target).
- docker/qdrant.Dockerfile: extends qdrant/qdrant and installs curl for health checks.
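To see exactly what the override changes, render the merged configuration; docker compose config is a standard command and works with the same COMPOSE_FILE value the Makefile computes:
# Render the merged config with the GPU override applied; inspect the ollama service
COMPOSE_FILE=docker-compose.yml:docker-compose-gpu-override.yml docker compose config | less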
One-time networks
make network # creates shared docker networks hawki-network + hosting_network
Run this once per machine (or after pruning Docker networks). Safe to rerun.
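A quick way to confirm the networks exist before starting the stack (plain docker commands, independent of the Makefile):
# Rerun make network if either shared network is missing
docker network inspect hawki-network hosting_network >/dev/null 2>&1 || make network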
Start stack
make up-core
What make up-core does:
| Step | What happens |
|---|---|
| Compose context | Uses computed COMPOSE_FILE with ENV_FILE and optional COMPOSE_PROFILES. |
| Launch preview | Prints selected compose files before startup. |
| Model readiness | Pulls Ollama models: bge-m3, llama3.1:8b, llama3.2:1b. |
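Once up-core finishes, you can confirm the models landed; ollama list is a standard Ollama command, and hawki_ollama is the container name used elsewhere in this doc:
# Expect bge-m3, llama3.1:8b and llama3.2:1b in the output
docker exec hawki_ollama ollama list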
Model pulls (Ollama)
- Default pulls: bge-m3, llama3.1:8b, llama3.2:1b.
- Optional (manual): llama3.2:3b via docker exec hawki_ollama ollama pull llama3.2:3b.
- Rough VRAM guide: bge-m3 < 4 GB, llama3.2:1b ~2 GB, llama3.1:8b prefers 12-16 GB (see the check below).
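To see how much VRAM is actually available before picking a model (Linux hosts with the NVIDIA driver installed):
# Total and free GPU memory, one line per GPU
nvidia-smi --query-gpu=name,memory.total,memory.free --format=csv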
Health and logs
make test-services # curl checks for Qdrant, Neo4j, bridge, reranker
make logs-core # follow compose logs
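If make test-services fails, the backing services can be probed directly. The ports below are the upstream defaults (Qdrant 6333, Neo4j 7474) and assume they are published to the host in this stack; the bridge and reranker endpoints are stack-specific, so check make test-services for their URLs.
curl -fsS http://localhost:6333/ && echo "qdrant ok"       # Qdrant reports its version on /
curl -fsS http://localhost:7474/ >/dev/null && echo "neo4j ok"  # Neo4j HTTP discovery endpoint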
Ingest content (inside containers, internal URLs)
docker exec hawki_rag_bridge sh -lc "python /app/ingest/ingest_crawled.py \
--root /app/shared/<folder> \
--base-url http://localhost:8000 \
--provider ollama --graph --batch 16"
Shared volume path mapping
Path mapping: hawki_shared_storage (Docker volume) -> /app/shared in the bridge container and /var/www/storage/app/public in the Laravel app.
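To stage content for ingestion, copy it into the shared volume through the bridge container; the local source path here is a placeholder:
# Copy a local crawl folder into the shared volume, then confirm it landed
docker cp ./my-crawl hawki_rag_bridge:/app/shared/
docker exec hawki_rag_bridge ls /app/shared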
Shut down / reset
make down-core
make down-rag
make neo4j-fresh # stops Neo4j, wipes /data, restarts clean graph
Troubleshooting tips for Make targets
- If pulls are slow: pre-pull with docker compose pull or check VPN/proxy.
- If Ollama pulls hang: pull manually in hawki_ollama.
- If GPU is expected but not detected on Linux: install nvidia-container-toolkit and restart Docker, or force CPU mode with USE_OLLAMA_GPU=0.
Concrete commands for each fix are shown below.
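Concrete forms of the fixes above; the CUDA image tag in the GPU smoke test is only an example:
# Pre-pull all images referenced by the selected compose files
docker compose pull
# Pull a model manually inside the Ollama container
docker exec -it hawki_ollama ollama pull bge-m3
# Smoke-test GPU visibility after installing nvidia-container-toolkit
docker run --rm --gpus all nvidia/cuda:12.4.1-base-ubuntu22.04 nvidia-smi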